KAFKA-13149: fix NPE for record==null when handling a produce request #11080
Conversation
This code https://github.com/apache/kafka/blob/bfc57aa4ddcd719fc4a646c2ac09d4979c076455/clients/src/main/java/org/apache/kafka/common/record/DefaultRecord.java#L294-L296 returns record=null, which can subsequently cause a null pointer exception in https://github.com/apache/kafka/blob/bfc57aa4ddcd719fc4a646c2ac09d4979c076455/core/src/main/scala/kafka/log/LogValidator.scala#L191. This PR makes the broker throw an InvalidRecordException and notify the client instead. The fix mirrors https://github.com/apache/kafka/blob/bfc57aa4ddcd719fc4a646c2ac09d4979c076455/clients/src/main/java/org/apache/kafka/common/record/DefaultRecord.java#L340-L358, where we throw an InvalidRecordException when the record's integrity is broken.
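For intuition, here is a minimal, self-contained sketch of the failure mode (hypothetical names, and a plain int length prefix instead of Kafka's varint; this is not Kafka's actual code):

```java
import java.nio.ByteBuffer;

// Hypothetical, condensed illustration of the bug: a readFrom-style parser
// returns null when the declared body size exceeds the bytes available, and
// a LogValidator-style caller dereferences the result unconditionally.
public class NullRecordNpeSketch {
    static String readRecord(ByteBuffer buffer) {
        int sizeOfBodyInBytes = buffer.getInt(); // declared record body size
        if (buffer.remaining() < sizeOfBodyInBytes)
            return null; // pre-fix behavior: silently signal "not enough bytes"
        byte[] body = new byte[sizeOfBodyInBytes];
        buffer.get(body);
        return new String(body);
    }

    public static void main(String[] args) {
        // A malformed batch: declares a 100-byte body but provides none.
        ByteBuffer malformed = ByteBuffer.allocate(4).putInt(100);
        malformed.flip();
        String record = readRecord(malformed);
        // Mirrors record.hasMagic(batch.magic) in LogValidator: the caller
        // never null-checks, so the broker hits an NPE instead of rejecting
        // the malformed request.
        System.out.println(record.isEmpty()); // NullPointerException here
    }
}
```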
This PR is ready for review.
-                return null;
+                throw new InvalidRecordException("Invalid record size: expected " + sizeOfBodyInBytes +
+                    " bytes in record payload, but instead the buffer has only " + buffer.remaining() +
+                    " remaining bytes.");
Is this really an exceptional case? Don't we do reads where we don't know exactly where the read ends and hence will trigger this path?
Are you saying the case where we have not yet finished reading the request? I didn't see a retry path, but it will cause a null pointer exception at

if (!record.hasMagic(batch.magic)) {
What do you suggest I do here?
I think the intent here was to cover the case where an incomplete record is returned by the broker. However, we have broker logic to try and avoid this case since KIP-74:
} else if (!hardMaxBytesLimit && readInfo.fetchedData.firstEntryIncomplete) {
  // For FetchRequest version 3, we replace incomplete message sets with an empty one as consumers can make
  // progress in such cases and don't need to report a `RecordTooLargeException`
  FetchDataInfo(readInfo.fetchedData.fetchOffsetMetadata, MemoryRecords.EMPTY)
@hachikuji Do you remember if there is still a reason to return null here instead of the exception @ccding is proposing?
> the case where an incomplete record is returned by the broker

I am referring to the produce API for the null pointer exception. The record is from a producer, and the InvalidRecordException will trigger a response to the producer.
If the fetch path requires a different return value, I guess the problem becomes more complicated.
Yes, I understand you're talking about the producer case. I am talking about the fetch case. As I said, I think we may not need that special logic anymore, but @hachikuji would know for sure.
@hachikuji do you have time to have a look at this?
Apologies for the delay here. I don't see a problem with the change. I believe that @ijuma is right that the fetch response may still return incomplete data, but I think this is handled in ByteBufferLogInputStream: we stop batch iteration early if there is incomplete data, so we would never reach the readFrom here, which is called for each record in the batch (sketched after the snippet below). It's worth noting also that the only caller of this method (in DefaultRecordBatch.uncompressedIterator) has the following logic:
try {
    return DefaultRecord.readFrom(buffer, baseOffset, firstTimestamp, baseSequence, logAppendTime);
} catch (BufferUnderflowException e) {
    throw new InvalidRecordException("Incorrect declared batch size, premature EOF reached");
}
So it already handles underflows in a similar way.
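For intuition, here is a hedged sketch of the early-stop behavior described above; the constants and structure paraphrase ByteBufferLogInputStream but are simplified, not the actual implementation:

```java
import java.nio.ByteBuffer;

// Simplified sketch: batch iteration just ends (returns null) when the buffer
// holds less than one complete batch, so the per-record readFrom never sees a
// batch that was truncated by the fetch size limit.
public class EarlyStopSketch {
    static final int SIZE_OFFSET = 8;   // batch size field follows the 8-byte base offset
    static final int LOG_OVERHEAD = 12; // 8-byte offset + 4-byte size prefix

    static ByteBuffer nextBatch(ByteBuffer buffer) {
        if (buffer.remaining() < LOG_OVERHEAD)
            return null; // incomplete header: stop iterating, don't throw
        int batchSize = buffer.getInt(buffer.position() + SIZE_OFFSET);
        if (buffer.remaining() < LOG_OVERHEAD + batchSize)
            return null; // incomplete body: stop before record-level parsing
        ByteBuffer batch = buffer.slice();
        batch.limit(LOG_OVERHEAD + batchSize);
        buffer.position(buffer.position() + LOG_OVERHEAD + batchSize);
        return batch;
    }

    public static void main(String[] args) {
        ByteBuffer truncated = ByteBuffer.allocate(10); // fewer than LOG_OVERHEAD bytes
        System.out.println(nextBatch(truncated)); // null: iteration simply ends
    }
}
```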
Thanks for checking @hachikuji.
LGTM
@ccding I kicked off a new build since it has been a while since the PR was submitted. Assuming tests are ok, I will merge shortly. Thanks for your patience.
The build Jason kicked off failed two tests; both passed in my local run after merging trunk into this branch. Pushing the trunk merge to this branch and letting Jenkins run it again.
…equests (#11080) Raise `InvalidRecordException` from `DefaultRecordBatch.readFrom` instead of returning null if there are not enough bytes remaining to read the record. This ensures that the broker can raise a useful exception for malformed record batches. Reviewers: Ismael Juma <[email protected]>, Jason Gustafson <[email protected]>